How Greek the Web Is
Abstract
Apart from being a huge repository of information of every kind, the Internet has become the main means of modern communication, and the World Wide Web has emerged as a new sort of society, since it usually reflects almost all aspects of modern societies in terms of their economic, political and social status and structure. There, over wired and wireless connections, and through ingenious ideas (i.e., algorithms) that exploit the available computational power, a new kind of culture emerges, combining elements from existing traditional civilizations and cultures, such as history, arts, science and technology, education and language. Motivated by the fundamental and influential nature of the Greek language, our paper investigates its influence on written texts hosted on the World Wide Web. In other words, our work addresses the question: how Greek is the Web? Our approach lies in automatically detecting and measuring the frequency of words of Greek origin in user-selected URLs; we focused on URLs containing English text, but our work can be (easily) extended to URLs containing text in other languages. To this end, we designed and implemented in Python a cultural algorithm which, starting with a small collection of Greek lemmata and exemplars, is able to automatically generate and recognize new lemmata and English words of Greek origin in web texts. In addition, we designed and implemented a Python-based application which uses our cultural algorithm to analyze user-selected web texts in terms of content of Greek origin and visualizes the analysis results. The application has been tested on a collection of web texts from education, development, science and technology, indicating that, on average, 10% of the English words used are of Greek origin.
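The measurement step described in the abstract (the share of words in a text that are of Greek origin) can be illustrated with a minimal sketch. This is not the authors' cultural algorithm: the seed list `GREEK_STEMS` and the function names are hypothetical, and plain substring matching stands in for the lemma generation and recognition the paper describes. Substring matching produces false positives (for example, "blog" contains "log"), so the result is only a rough indicator.

```python
import re

# Hypothetical seed stems; the paper's actual lemmata are not listed in the abstract.
GREEK_STEMS = ("bio", "graph", "log", "phil", "tele", "techn", "dem", "chron")


def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def is_greek_origin(word, stems=GREEK_STEMS):
    """Naive check (assumption): a word is flagged if it contains any seed stem."""
    return any(stem in word for stem in stems)


def greek_share(text, stems=GREEK_STEMS):
    """Return the fraction of words in `text` flagged as Greek-derived."""
    words = tokenize(text)
    if not words:
        return 0.0
    hits = sum(1 for word in words if is_greek_origin(word, stems))
    return hits / len(words)
```

Calling `greek_share(page_text)` on the plain text of a fetched page gives a value between 0 and 1 that can be compared with the roughly 10% average reported above, keeping in mind that the seed list here is far smaller than any realistic lexicon.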
Similar resources
The Discursive Construction of Ethnic Identities: The Case of Greek-Cypriot Students
This study examines how Greek-Cypriot students aged 12 to 18, an understudied group of students, construct their ethnic identity in a complex setting such as Cyprus and what motivates the students in the selection of ethnic identity labels. The choice to focus on students aged 12-18 was made on the hypothesis that young children, who did not experience the 1974 war in Cyprus, may have a differe...
Modeling and Querying Greek Legislation Using Semantic Web Technologies
In this work, we study how legislation can be published as open data using semantic web technologies. We focus on Greek legislation and show how it can be modeled using ontologies expressed in OWL and RDF, and queried using SPARQL. To demonstrate the applicability and usefulness of our approach, we develop a web application, called Nomothesia, which makes Greek legislation easily accessible to ...
Text extraction and Web searching in a non-Latin language
Recent studies of queries submitted to Internet Search Engines have shown that non-English queries and unclassifiable queries have nearly tripled during the last decade. Most search engines were originally engineered for English. They do not take full account of inflectional semantics nor, for example, diacritics or the use of capitals which is a common feature in languages other than English. ...
Querying the Greek Web in Greeklish
In this paper, we experimentally study the problem of querying the web in a hybrid language, namely Greeklish. Greeklish is the transliteration of Greek into Latin characters of the ASCII code. Although Greeklish emerged as a convenient means for the creation and distribution of digital data at a time when Unicode Transformation Format was not supported for the Greek alphabet, nevertheless it is s...
Elicitation Strategies for Web Application Using Activity Theory
Requirements engineering (RE) is often seen as an essential facet in software development. It is a vital process before each project starts. In the context of systems engineering, an understanding and application of systems theory and practice is also relevant to RE. The contexts in which RE takes place habitually involve human activities. Therefore, RE needs to be sensitive to how people perce...
Publication date: 2013